DETAILED ACTION

1. This communication is in response to the Amendments and Arguments filed on
06/29/2009. Claims 1-22 are pending and have been examined. The Applicants' 
amendment and remarks have been carefully considered, but they do not place the 
claims in condition for allowance. 

2. All previous objections and rejections directed to the Applicant's disclosure and 
claims not discussed in this Office Action have been withdrawn by the Examiner. 



Response to Amendments and Arguments 
3. Applicant's arguments (pages 7-11) filed 06/29/2009 with regard to claims 1-22
have been fully considered but are not persuasive and, with respect to the newly
added limitation, are moot in view of the new grounds of rejection.

With respect to claim 1, the Applicant argues that the Examiner has
misinterpreted the term "dramatic parameters" as defined in the Specification.
The Examiner respectfully disagrees with this assertion. The
Specification in paragraphs [0010] and [0035], as denoted by the Applicant in the
Remarks section, shows examples of dramatic parameters such as key tempo and
mood. It should be noted that, for the dramatic parameters to be read narrowly,
they must be specifically defined in the Specification. However, this is not the case,
since paragraph [0038] describes the dramatic parameters that have been
defined as examples and further defines such parameters to be markup language tags
as well as other attributes, where the attributes have not been defined by the Specification.
Hence, the usage of pitch information and frequency information as disclosed by Finn in
view of Mitton can, in light of this definition, be interpreted to be dramatic parameters.
In Finn, Figure 3A and page 6, paragraphs 3 and 4, such a time-ordered sequence of
pitches is extracted. These pitches are attributes that characterize the signal.
Furthermore, Mitton teaches that these pitches are in the form of a table in col. 5,
lines 12-22. Hence, the Applicants' arguments are not persuasive.

The newly added limitation of "in tandem with said audio signal" necessitates 
new grounds for rejection. 

Claim Rejections - 35 USC § 103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

5. Claims 1-3, 5, 7-10, 16-22 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Finn (WO 01/11495) in view of Mitton (US 6,355,869) in view of
Niikura et al. (JP 06-068168).

As to claims 1, 17, 21, and 22: Finn discloses augmenting an audio signal (see
Figure 1) comprising: 

receiving an audio signal (Figure 2, input search criteria steps 20 and 21),

extracting features from said audio signal (see Figure 2, step 22, identify
pitch of successive notes),

generating a time based table of dramatic parameters according to the
extracted features (see page 6, 3rd and 4th paragraphs, and see Figure 3A; from
the cited portion and the figure it can be seen that a time-ordered sequence of
pitches is extracted), and

obtaining media fragments at least in part in dependence on the table of
dramatic parameters (see page 15, lines 10-33, comparison between query and
database based on similarity) and wherein the media fragments are
unrelated to the audio signal prior to the obtaining act (see page 3, lines 5-7, only
a search criterion is input and a target file is retrieved, which is not the same as
the audio input above), and

outputting said media fragments (see page 21, lines 21-28, music file
output or list displayed to user).

However, Finn does not specifically teach a time-ordered table. 

Mitton does teach a time-ordered table (see Col. 5, lines 12-22, where
Mitton discusses a pseudo wave file with a series of pitch coefficients for each
frame, and Figure 33).

It would have been obvious to one skilled in the art at the time the 
invention was made to modify the audio search as taught by Finn, and use a 
time-ordered table as taught by Mitton, thus allowing a user to produce a musical 
score from a recording, as discussed by Mitton (see Col. 1, lines 55-60). 

However, Finn in view of Mitton does not specifically teach outputting in
tandem with said audio signal.

Niikura does teach outputting in tandem with said audio signal (see
[0010] and [0024], where, based on input speech from a user, corresponding
video and sound information containing the input speech (keyword) is output).

It would have been obvious to one skilled in the art at the time the 
invention was made to modify the audio search as taught by Finn in view of 
Mitton, and retrieve video as taught by Niikura, thus allowing a user to find
images based on sound for easy retrieval (see Niikura, [0008]). 

As to claim 2, Finn in view of Mitton in view of Niikura teach all of the limitations 
as in claim 1, above.

Furthermore, Finn teaches features extracted from said audio signal
include tempo (see page 25, lines 2 and 15, where key and tempo are determined
from the input and used in the search criteria; see also page 23, lines 22-26,
used in first-pass matching).

As to claim 3, Finn in view of Mitton in view of Niikura teach all of the limitations 
as in claim 1, above.

Furthermore, Mitton does teach generation of a time-ordered table (see
Col. 5, lines 12-22, where Mitton discusses a pseudo wave file with a series of
pitch coefficients for each frame, and Figure 33).

Furthermore, Finn discloses that generating the table of dramatic parameters comprises
retrieving a list of dramatic parameters and associated audio features (see page
11, lines 8-10, features from data are compared with search criteria, where the
matching criteria of dramatic parameters are shown on page 25, lines 2 and 15),
comparing and matching the extracted features with the retrieved associated
audio features (see page 11, lines 8-10, features from data are compared with
search criteria), and inserting an entry comprising the dramatic parameter
associated with the audio feature (see page 23, line 21 to page 25, line 16,
various criteria are determined in order to determine a match, where the
determination of the dramatic parameter is the inserting for matching purposes).

As to claim 5, Finn in view of Mitton in view of Niikura teach all of the limitations 
as in claim 1, above.

Furthermore, Finn teaches obtaining said media fragments comprises
selecting a fragment from a store (see page 11, lines 8-10, music files in
databases 9 and 10 are used), said fragment being stored with an associated
dramatic parameter which matches the respective entry in the table of dramatic
parameters (see page 25, lines 2 and 15, where key and tempo are determined from
the input and used in the search criteria; see also page 23, lines 22-26, used in
first-pass matching).

As to claim 7, Finn in view of Mitton in view of Niikura teach all of the limitations 
as in claim 5, above.

Furthermore, Finn teaches receiving user input, said user input affecting
said obtaining (see page 6, lines 13-18, user inputs a voice or a tune and see
page 3, lines 1-8, based on user input a matching music is obtained).

As to claim 8, Finn in view of Mitton in view of Niikura teach all of the limitations 
as in claim 1, above.

Furthermore, Niikura teaches the media fragments being video data (see
[0010], video data).



As to claim 9, Finn in view of Mitton in view of Niikura teach all of the limitations 
as in claim 1, above.

Furthermore, Finn teaches further comprising
storage for storing said media fragments (see page 11, line 9, database 9 or 10).

Furthermore, it would have been obvious to one of ordinary skill in the
art to have stored the audio signal at least temporarily as well in order to perform
the extraction of features from the audio signal for comparison (see Finn, page
6, lines 15-19).



As to claim 10, Finn in view of Mitton in view of Niikura teach all of the limitations 
as in claim 1, above.

Furthermore, Finn teaches wherein said outputting comprises rendering
said media fragments and said audio signal (see page 21, lines 26-29, a link to the
media fragment is displayed, which the user can select to hear. It is obvious that
the computer system includes a built-in speaker to hear such results
corresponding to the tune of the search query. Hence, the rendering of the audio
signal occurs by the rendering of a match that is found similar to the tune that
was input.)

As to claim 16, Finn in view of Mitton in view of Niikura teach all of the limitations 
as in claim 1, above.

Furthermore, Finn teaches wherein combinations of extracted features
have associated dramatic parameters (see page 25, lines 15 and 16, where the
tempo is based on mean note durations in seconds, i.e., the mean of the pitches
for a specific duration determines the dramatic parameter).

As to claims 18 and 22, Finn in view of Mitton in view of Niikura teach all of the 
limitations as in claims 17 and 21, above.

Furthermore, Finn teaches further comprising storage for storing said
media fragments (see page 11, line 9, database 9 or 10).

Furthermore, Mitton teaches storing the dramatic parameters (see col. 5,
lines 22-32, list of events, and line 35, where the MIDI file is created; it is obvious
that it will be stored (see Abstract)).

As to claim 19, Finn in view of Mitton in view of Niikura teach all of the limitations
as in claim 17, above.

Furthermore, Finn teaches wherein said at least one output device 
comprises display means on which said media fragments are displayed (see 
page 6, line 5, monitor 4, and page 21, lines 25-26, user presented with search
results).

As to claim 20, Finn in view of Mitton in view of Niikura teach all of the limitations 
as in claim 17, above.

Furthermore, Finn teaches wherein said at least one output device
comprises display means on which said media fragments are displayed (see
page 6, line 5, monitor 4, and page 21, lines 25-26, user presented with search
results).

Furthermore, Mitton teaches the output device responsive to instructions 
associated with said dramatic parameters (see col. 10, lines 6-9, user can modify 
the event list). 

5. Claim 4 is rejected under 35 U.S.C. 103(a) as being unpatentable over Finn in 
view of Mitton in view of Niikura as applied to claim 1 above, and further in view of 
Weare (US 2003/0045954). 

As to claim 4, Finn in view of Mitton in view of Niikura teach all of the limitations 
as in claim 1, above.

However, Finn in view of Mitton in view of Niikura does not teach the
parameters being mood, change of pace, and incidents.

Weare does teach use of the parameters mood (see [0095], mood), change of
pace (see [0066], flow), and incidents (see [0066], rhythmic activity).

It would have been obvious to one skilled in the art at the time the 
invention was made to modify the audio searching of Finn in view of Mitton in 
view of Niikura, and use video as taught by Weare, for the classification of media 
entities according to melodic properties (see Weare [0002]). 

6. Claims 6 and 11-14 are rejected under 35 U.S.C. 103(a) as being unpatentable 
over Finn in view of Mitton in view of Niikura as applied to claim 1 above, and further in 
view of Balnaves (US 6,954,894). 

As to claim 6, Finn in view of Mitton in view of Niikura teach all of the limitations 
as in claim 1, above.

However, Finn in view of Mitton in view of Niikura do not specifically teach 
generating fragments. 

Balnaves teaches generating a fragment (see col. 11, lines 12-29, where 
the user input is modified to form a fragment depending on template selected, 
silent movie is chosen). 

It would have been obvious to one skilled in the art at the time the 
invention was made to modify the audio searching of Finn in view of Mitton in
view of Niikura, and use video as taught by Balnaves, for effectively controlling
and editing multimedia output (see Balnaves, col. 1, lines 7-11).

As to claim 11, Finn in view of Mitton in view of Niikura teach all of the limitations
as in claim 1, above.

Furthermore, Finn in view of Mitton in view of Niikura teach dramatic
parameter data, matching dramatic parameters to media fragments, and
selecting and generating according to dramatic parameter lists.

However, Finn in view of Mitton in view of Niikura do not specifically teach
the story template.

Balnaves teaches prior to obtaining said media segments, selecting a 
story template (see col. 8, lines 27-30, user selects template) at least in part in 
dependence on said table of dramatic parameters (see col. 8, lines 54-60,
templates used to evoke action or intent and see Figure 12 and 13, where each 
type of movie has a specific template), said story template affecting said 
obtaining of media fragments (see Figure 5, 501 and 508, template and movie 
player, output of processed data) (e.g. The template chosen affects the output 
data). 

It would have been obvious to one skilled in the art at the time the 
invention was made to modify the audio searching of Finn in view of Mitton in 
view of Niikura, and use video as taught by Balnaves, for effectively controlling
and editing multimedia output (see Balnaves, col. 1, lines 7-11).

As to claim 12, Finn in view of Mitton in view of Niikura in view of Balnaves teach 
all of the limitations as in claim 1, above.

Furthermore, Finn teaches the use of dramatic parameters (see page 6,
3rd and 4th paragraphs, and see Figure 3A; from the cited portion and the figure it
can be seen that a time-ordered sequence of pitches is extracted).

Furthermore, Balnaves teaches wherein said story template comprises
dramatic parameter data related to a narrative story structure (see Figure 12 and
Figure 13, each type of template movie selected consists of various parameters).

As to claim 13, Finn in view of Mitton in view of Niikura in view of Balnaves teach 
all of the limitations as in claim 1, above.

Furthermore, Finn teaches matching the dramatic parameters with the
media fragments features (see page 11, lines 8-10, features from data are
compared with search criteria, where the matching criteria of dramatic
parameters are shown on page 25, lines 2 and 15).

Furthermore, Balnaves teaches using a story template that comprises dramatic
parameter data related to a narrative story structure (see Figure 12 and Figure
13, each type of template movie selected consists of various parameters).

As to claim 14, Finn in view of Mitton in view of Niikura in view of Balnaves teach
all of the limitations as in claim 1, above.

Furthermore, Balnaves teaches wherein the story template for selection is 
generated according to logical story structure rules and the dramatic parameter 
list (see Figures 12 and 13, where sample template is shown) (e.g., from the
Figures, it is obvious to one skilled in the art that the templates were generated
beforehand, based on movie genre or user preferences related to the genre).

7. Claim 15 is rejected under 35 U.S.C. 103(a) as being unpatentable over Finn in 
view of Mitton in view of Niikura as applied to claim 1 above, and further in view of 
Williams (US 6,308,154). 

As to claim 15, Finn in view of Mitton in view of Niikura teach all of the limitations 
as in claim 1, above.

However, Finn in view of Mitton in view of Niikura do not specifically teach 
the use of physical markup language tags. 

Williams teaches an instruction set of a markup language (see Col. 3, lines 2-8,
where Williams discusses that attributes are encoded using a markup language
and markup indicators).

It would have been obvious to one skilled in the art at the time the 
invention was made to modify the audio searching as taught by Finn in view of 
Mitton in view of Niikura, and use an instruction set of a markup language as taught
by Williams, thus allowing measurement and encoding of recognized content, as
discussed by Williams (see Col. 1, lines 52-57).

Conclusion 

8. Applicant's amendment necessitated the new ground(s) of rejection presented in 
this Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP
§ 706.07(a). Applicant is reminded of the extension of time policy as set forth in 37
CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within 
TWO MONTHS of the mailing date of this final action and the advisory action is not 
mailed until after the end of the THREE-MONTH shortened statutory period, then the 
shortened statutory period will expire on the date the advisory action is mailed, and any 
extension fee pursuant to 37 CFR 1.136(a) will be calculated from the mailing date of 
the advisory action. In no event, however, will the statutory period for reply expire later 
than SIX MONTHS from the date of this final action. 

9. The prior art made of record and not relied upon is considered pertinent to 
applicant's disclosure. 

Kanevsky et al. (US 6,434,520) is cited to disclose indexing and querying of 
audio archives. Brinkman et al. (US 6,740,803) is cited to disclose multimedia 
presentation of an audio file for playing a musical instrument.
The NPL document by Lu ("Indexing and Retrieval of Audio: A Survey") is cited to
disclose retrieval of audio using speaker characteristics. 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to PARAS SHAH whose telephone number is (571) 270-1650.
The examiner can normally be reached on MON.-THURS. 7:30 a.m.-4:00 p.m.
EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, David Hudspeth, can be reached on (571) 272-7843. The fax phone number
for the organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the 
Patent Application Information Retrieval (PAIR) system. Status information for 
published applications may be obtained from either Private PAIR or Public PAIR. 
Status information for unpublished applications is available through Private PAIR only. 
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should 
you have questions on access to the Private PAIR system, contact the Electronic 
Business Center (EBC) at 866-217-9197 (toll-free). If you would like assistance from a 
USPTO Customer Service Representative or access to the automated information 
system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000.

Supervisory Patent Examiner, Art Unit 2626

/P. S./ 

Examiner, Art Unit 2626 